- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.70)
- Information Technology > Information Management > Search (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
A Theory of I/O-Efficient Sparse Neural Network Inference
Niels Gleinig, Tal Ben-Nun, Torsten Hoefler
As the accuracy of machine learning models increases rapidly, so does their demand for energy and compute resources. At a low level, most of these resources are consumed by data movement between memory units. Modern hardware architectures contain a fast memory (e.g., caches, registers), which is small, and a slow memory (e.g., DRAM), which is larger but expensive to access. Since only data stored in fast memory can be processed, computation incurs data movement (input/output operations, or I/Os) between the two units. In this paper, we provide a rigorous theoretical analysis of the I/Os needed for sparse feedforward neural network (FFNN) inference. We establish bounds that determine the optimal number of I/Os up to a factor of 2 and present a method that uses a number of I/Os within that range. Much of the I/O complexity is determined by a few high-level properties of the FFNN (the numbers of inputs, outputs, neurons, and connections), but getting closer to the exact lower bound requires taking the instance-specific sparsity pattern into account. Starting from the 2-optimal computation strategy, we show how to reduce the number of I/Os further with simulated annealing. Complementing this result, we provide an algorithm that constructively generates networks with maximum I/O efficiency for inference. We test the algorithms and empirically verify our theoretical and algorithmic contributions. In our experiments on real hardware, we observe speedups of up to 45$\times$ relative to the standard way of performing inference.
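To make the two-level model concrete, here is a minimal sketch, assuming an LRU-managed fast memory of capacity M and ignoring weight traffic and stores, of how one might count I/Os for a given neuron evaluation order and then shrink that count with simulated annealing, in the spirit of the schedule optimization the abstract describes. Everything here (`io_count`, `anneal_schedule`, the LRU policy, the swap move) is an illustrative assumption, not the paper's actual bounds or algorithm.

```python
import math
import random
from collections import OrderedDict

def io_count(layers, M):
    """I/Os (activation loads) for evaluating `layers` in the given order.
    Each layer is a list of neurons; each neuron is the list of indices of
    the previous layer's activations it reads (its sparse in-edges)."""
    ios = 0
    for layer in layers:
        fast = OrderedDict()  # simplification: fast memory reset per layer
        for in_edges in layer:
            for src in in_edges:
                if src in fast:
                    fast.move_to_end(src)          # hit: refresh LRU position
                else:
                    ios += 1                       # miss: load from slow memory
                    fast[src] = True
                    if len(fast) > M:
                        fast.popitem(last=False)   # evict least recently used
    return ios

def anneal_schedule(layers, M, steps=2000, temp=1.0, cooling=0.995):
    """Toy simulated annealing over the neuron evaluation order within
    each layer; the move set and acceptance rule are assumptions."""
    cur = [list(layer) for layer in layers]
    cur_cost = best_cost = io_count(cur, M)
    best = [list(layer) for layer in cur]
    for _ in range(steps):
        cand = [list(layer) for layer in cur]
        layer = random.choice(cand)
        if len(layer) > 1:                          # move: swap two neurons
            i, j = random.sample(range(len(layer)), 2)
            layer[i], layer[j] = layer[j], layer[i]
        cost = io_count(cand, M)
        # accept improvements always, worse schedules with Boltzmann probability
        if cost <= cur_cost or random.random() < math.exp((cur_cost - cost) / temp):
            cur, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = [list(lay) for lay in cand], cost
        temp *= cooling
    return best, best_cost

if __name__ == "__main__":
    random.seed(0)
    # One random sparse layer: 64 neurons, each reading 4 of 64 activations.
    layer = [random.sample(range(64), 4) for _ in range(64)]
    print("I/Os, given order:", io_count([layer], M=8))
    _, cost = anneal_schedule([layer], M=8, steps=1000)
    print("I/Os, annealed   :", cost)
```

The point of the toy demo is that the evaluation order alone changes the I/O count under a fixed memory capacity, which is precisely the degree of freedom a schedule-level optimizer such as simulated annealing can exploit.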